Abstract

Background Complex networks can often be decomposed into less complex sub-networks whose struc- 
tures can give hints about the functional organization of the network as a whole. However, these structural 
motifs can only tell one part of the functional story because in this analysis each node and edge is treated 
on an equal footing. In real networks, two motifs that are topologically identical but whose nodes perform 
very different functions will play very different roles in the network. 

Methodology/Principal Findings Here, we combine structural information derived from the topology 
of the neuronal network of the nematode C. elegans with information about the biological function of 
these nodes, thus coloring nodes by function. We discover that particular colorations of motifs are signif- 
icantly more abundant in the worm brain than expected by chance, and have particular computational 
functions that emphasize the feed-forward structure of information processing in the network, while evad- 
ing feedback loops. Interneurons are strongly over-represented among the common motifs, supporting 
the notion that these motifs process and transduce the information from the sensor neurons towards the 
muscles. Some of the most common motifs identified in the search for significant colored motifs play a 
crucial role in the system of neurons controlling the worm's locomotion. 

Conclusions/Significance The analysis of complex networks in terms of colored motifs combines two
independent data sets to generate insight about these networks that cannot be obtained with either data
set alone. The method is general and should allow a decomposition of any complex network into its
functional (rather than topological) motifs as long as both wiring and functional information is available.



Introduction 

Over the last decades, systems biology and network theory have contributed tremendously to our
understanding of complex systems [1-4], revealing for example that the topological architecture of the
molecular interaction networks within a cell is shared to a large degree by other complex systems, such as 
the Internet, computer chips and society [2]. This insight led to the development of various quantitative 
tools in network theory to analyze the complex structures within biological networks. 

Complex networks like electronic circuits are frequently represented in terms of modules such as
operational amplifiers, logical gates and memory, and it is often suggested that biological networks can
similarly be decomposed into functional modules that have stereotypical functions [5,6]. Because the
detection and identification of modules is a notoriously difficult task [7], a different approach focuses on the
identification of conserved network motifs [8-10], that is, sub-networks of small size (typically two to five
nodes) that are significantly more abundant in a network compared to a control network that had its
edges randomly rearranged. The idea behind looking for significant motifs is evolutionary in nature: those 
motifs that are conducive to the function of the organism within its environment will be preferentially 
maintained over motifs that are either neutral in function or even detrimental. For example, an analysis 
of structural motif abundances in a variety of biological networks shows that these abundances can in 
part be explained by the motifs' robustness to small perturbations [11]. This thinking equally applies to
technological systems that do not evolve according to strict Darwinian rules. For example, a comparison
of motif abundances in biological, technological, social, and even word-adjacency networks [12] shows
that these networks can be grouped into clusters that share similar motif abundance profiles.

We analyze motifs in the network of synaptic and gap-junction connections of the neuronal network 
of the nematode C. elegans. This network controls one of the most well-understood complex biological 
systems to date, and most of the network architecture of the 302 neurons of the hermaphrodite worm is
known from experimental work [13,14] as well as recent reconstructions [15]. The most up-to-date wiring
information covers 279 neurons of the somatic nervous system, excluding 20 neurons of the pharyngeal
system and three neurons that appear to be unconnected from the rest [15]. There are 3,606 edges between
these nodes, of which some (the synaptic connections) are directed, while gap-junctions are undirected. 

An analysis of topological motifs in this network has revealed that two major building blocks are 
significantly overrepresented in the C. elegans neuronal network: the feedforward loop, and the bi-fan
motif [16]. It is believed that these motifs perform stereotypic functions and play a crucial role in the
nematode's decision-making and control [15,17,18]. However, while there is support for the hypothesis
that over-represented motifs point to biological function from the evolutionary conservation of motifs in
the yeast protein-protein interaction network [19], these conclusions have also been questioned [20] on
the grounds that topology alone does not contain enough information to predict the function or process,
or how biochemical reactions are likely to proceed in biological systems [9]. Indeed, the identification


Figure 1. Significance of uncolored vs. colored motifs. (A): Non-significant motif from an 
uncolored analysis [16] becomes highly significant (B) if colors are used to attach functional tags to the 
nodes. Green: sensor neurons, red: interneurons, blue: motor neurons. 

of the feedforward and bifan motifs does not allow us to determine how these motifs are used, or how 
they contribute to the worm's behavior. A simple example can illustrate this point: in Fig. 1A, we show
a three-node motif that was not found to be significantly overrepresented in previous analyses [16-18].
However, if we color each neuron according to three possible functional tags such as motorneuron (blue),
sensor neuron (green), or interneuron (red), several colored motifs stand out with high significance (see
below), among which the motif shown in Fig. 1B. The functional significance of this motif is immediately
obvious: it relays sensory information via an interneuron towards a muscle. Indeed, previous studies have
shown that the connections between neurons of the three types chosen here are heavily biased: neurons
do not connect indiscriminately between types [18,21-23]. Also, an analysis of colored motifs using GO
annotations in the yeast protein-protein interaction network [24] suggests that differently colored motifs
are differentially evolutionarily conserved, pointing to a diversity of functional roles for motifs with the 
same structure. 

Here we combine two important data sets for a systematic analysis of the topological and functional 
motifs in the C. elegans brain: the connection graph and the functional characterization of each
neuron [15]. Using these datasets, the entire C. elegans neuronal network becomes a colored graph where
nodes represent neurons, edges are connections between neurons, and the color of the node tags the 
cell-type of the node. It is clear that the choice of the cell-type set (the colors) is crucial for the success of 
this method, and different choices will produce different results. At the same time, the classification into 
the three cell types is in itself ambiguous, because there are differences of annotation in the literature, 
and some cells are sometimes annotated as belonging to two classes. Other classifications exist (such as 
into ten different morphological classes [25]), but the motif analysis of graphs with more than three colors 
quickly becomes computationally cumbersome. Here, we study the abundance distribution of colored 
directed motifs of sizes two to four nodes. The number of possible motifs in a network strongly depends 
on the size of the motif, whether edges are directed, and the number of colors used to tag the nodes (see 
Table 1). 

While an identification of functional motifs can help us understand how the worm uses its neuronal 
network for signal transduction, we should keep in mind that the worm also uses extrasynaptic signaling
for behavior [26]. Furthermore, several different molecules can modulate synaptic function at a single
neuron [27]. Thus, some of the computation that translates signals into actions takes place outside of the
connection graph proper, and cannot be explored via a motif analysis. 

Table 1. Comparison of number of possible colored motifs in C. elegans neuronal network

          size 2     size 3      size 4
UM(1)     1/1        2/2         6/6
DM(1)     2/2        13/13       199/199
UM(2)     3/3        10/10       50/50
UM(3)     6/6        28/28       201/201
DM(2)     7/7        86/86       2,818/2,818
DM(3)     15/15      262/273     8,310/13,770


The table shows the actual numbers of colored motifs of a given size (and directedness) found in C. 
elegans neuronal network as well as the theoretically possible number of colored motifs as a pair of 
numbers (actual/possible). UM(1): undirected motifs, uni-colored, DM(1): directed motifs, uni-colored, 
UM(2): undirected motifs, two colors, DM(2): directed motifs, two colors, and so on. 
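
The "possible" counts for directed motifs in Table 1 can be cross-checked by brute force. The following is a minimal sketch, not the authors' code: it assumes that a gap junction is encoded like a reciprocal pair of edges (as in the adjacency matrix described in Methods), enumerates every weakly connected directed graph on a few nodes with every color assignment, and counts them up to relabeling of the nodes. The function names are hypothetical.

```python
from itertools import permutations, product


def weakly_connected(n_nodes, edges):
    """Return True if the edges connect all nodes when direction is ignored."""
    neighbors = {v: set() for v in range(n_nodes)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in neighbors[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n_nodes


def count_colored_directed_motifs(n_nodes, n_colors):
    """Count weakly connected colored digraphs on n_nodes nodes, up to relabeling."""
    pairs = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    perms = list(permutations(range(n_nodes)))
    canonical_forms = set()
    # State of each node pair: 0 = no edge, 1 = i->j, 2 = j->i,
    # 3 = both directions (assumption: this also stands for a gap junction).
    for states in product(range(4), repeat=len(pairs)):
        edges = []
        for (i, j), s in zip(pairs, states):
            if s & 1:
                edges.append((i, j))
            if s & 2:
                edges.append((j, i))
        if not weakly_connected(n_nodes, edges):
            continue
        for colors in product(range(n_colors), repeat=n_nodes):
            forms = []
            for p in perms:
                # Relabel node v as p[v] and carry its color along.
                new_edges = tuple(sorted((p[a], p[b]) for a, b in edges))
                new_colors = [0] * n_nodes
                for node, c in enumerate(colors):
                    new_colors[p[node]] = c
                forms.append((new_edges, tuple(new_colors)))
            # The smallest relabeled form identifies the isomorphism class.
            canonical_forms.add(min(forms))
    return len(canonical_forms)


for k in (1, 2, 3):
    print(k, count_colored_directed_motifs(2, k), count_colored_directed_motifs(3, k))
```

Under these assumptions the enumeration gives 2, 7 and 15 motifs of size 2 and 13, 86 and 273 motifs of size 3 for one, two and three colors, in agreement with the DM rows of the table. Size 4 with three colors is feasible with the same function but slow in pure Python.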



Results 

Adaptive significance of colored motif distribution 

If the coloration of a motif (that is, the identity of colors at different positions of the motif) has adaptive 
significance, we should see a bias in the colored motif distribution with respect to control networks whose 
color assignments have been scrambled. We extract colored motif abundances from the colored networks 
by counting all distinct color combinations for each of the structural motifs of size 2, 3, and 4. Of the 
279 neurons, 86 are classified as sensor neurons (and colored green in the following), 80 are classified as 
interneurons (red), and the remainder of 114 neurons are classified as motorneurons (blue) [15]. We stress
again that the classification of some neurons is uncertain because other groups [28] have classified some
neurons as belonging to two types simultaneously, and some neurons' classification is tentative. However, 
the results presented here do not vary significantly if a few neurons are misclassified. 

In order to determine whether the abundance of a particular colored motif in C. elegans is biased, we 
produce random colored control networks by shuffling the color assignments in the C. elegans network 
while maintaining the relative abundance of each kind. The mean abundance $N_R$ of colored motifs of
a particular type for 1,000 independent randomizations then provides the unbiased expectation for that
motif, which we compare with the actual count $N_{CE}$ obtained for the colored worm brain. In Fig. 2
we plot the logarithm (base 2) of the ratio $N_{CE}/N_R$ for each colored motif as a function of the random
count $N_R$, to determine the extent to which the worm motifs are over- or underrepresented. Most of
the motif counts in C. elegans are significantly different from the random control: all of the 2-node 
colored motif counts are significant, and all but one of the three-node motifs (one-sample two-tailed 
t-test, P < 0.05). Of the 4-node motifs, only 156 of the observed 8,310 motifs are not significantly 
different from the control count at the 5% level. We find a tendency of colored motifs in C. elegans to be 
under-represented compared to a randomly colored control, but with a significant number of motifs that 
are found much more often than expected by chance. (The distribution of normalized z-scores is strongly
biased towards suppression, but with a long tail indicating over-represented motifs, see Supplementary
Fig. S1.) This finding suggests that the majority of possible colored motifs are not useful or downright
detrimental, but a handful of them are so useful that they appear between 2 and 60 times as often as in 
an average randomly colored network. Note that some motifs that readily appear in the random controls 
are completely absent in C. elegans: 11 colored motifs of size three and 5,460 motifs of size 4 do not 
appear at all, which is also significantly different from what is expected by chance: at most 5 motifs of 
size 3 (1.02 on average) and 3,667 motifs of size 4 (2,634 on average) were absent by chance in any of the 
1,000 randomizations. 
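
The comparison between the observed count and the randomized counts can be sketched in a few lines. This is an illustration only, using made-up toy counts rather than data from the paper. The helper names are hypothetical, and only the t statistic is computed; turning it into a two-tailed P-value would additionally require the Student t distribution, for example from a statistics library.

```python
import math
import statistics


def log2_ratio(n_ce, random_counts):
    """log2 of the observed count over the mean randomized count.

    Positive values mean over-representation, negative values suppression.
    """
    mean_random = statistics.mean(random_counts)
    if n_ce == 0:
        return float("-inf")  # motif absent in the real network
    return math.log2(n_ce / mean_random)


def one_sample_t(n_ce, random_counts):
    """t statistic for the null hypothesis that the randomized counts have mean n_ce."""
    s = len(random_counts)
    mean_random = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)  # sample standard deviation
    return (mean_random - n_ce) / (sd / math.sqrt(s))


# Toy example (hypothetical numbers): a motif seen 40 times in the real network
# and about 10 times in each of four color randomizations.
toy_random = [10, 9, 11, 10]
print(log2_ratio(40, toy_random))
print(one_sample_t(40, toy_random))
```

A motif that is four times as frequent as its randomized mean lands at +2 on the vertical axis of a plot like Fig. 2, and a strongly negative t statistic indicates that the randomized counts lie well below the observed count.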



Two-node motifs



In previous work that analyzed structural motifs only [16-18], the unidirectional two-node motif was found
to be unremarkable, while the bi-directional motif was deemed over-represented [16,18] with respect to an
ensemble of edge-randomized networks. We can look at both of those motifs in terms of the exceptionality
of their colorations. In Fig. 3 we show the measured counts of each of the color realizations of the directed
(Fig. 3A) and undirected (Fig. 3B) motifs.

These distributions show that the observed functional constraints make intuitive sense. For example, 
we find the motor-to-sensor-neuron motif to be significantly suppressed: we do not expect muscles to relay 
information to sensory neurons in a functioning worm (even though some of these connections do indeed 
exist). On the other hand, the sensor-to-inter- as well as inter-to-inter-neuron motifs appear significantly 
more often than expected by chance, as appropriate for information-processing motifs. 
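
Counting colored two-node motifs is simple enough to show directly. The sketch below is an illustration on a hypothetical four-neuron toy network, not the C. elegans data: it assumes an adjacency matrix in which a one-way chemical synapse sets a single entry and a gap junction sets both entries, and it tallies directed pairs by (source color, target color) and undirected pairs by their unordered colors.

```python
from collections import Counter


def count_colored_pairs(adj, colors):
    """Tally colored two-node motifs in an adjacency matrix.

    adj[i][j] == 1 means an edge from node i to node j; a pair with both
    entries set is treated as undirected (assumption: gap junction).
    """
    directed, undirected = Counter(), Counter()
    n = len(adj)
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i][j] and adj[j][i]:
                undirected[tuple(sorted((colors[i], colors[j])))] += 1
            elif adj[i][j]:
                directed[(colors[i], colors[j])] += 1
            elif adj[j][i]:
                directed[(colors[j], colors[i])] += 1
    return directed, undirected


# Hypothetical toy network: sensor -> inter -> motor, plus a gap junction
# between two interneurons.
toy_colors = ["sensor", "inter", "inter", "motor"]
toy_adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
toy_directed, toy_undirected = count_colored_pairs(toy_adj, toy_colors)
print(dict(toy_directed))
print(dict(toy_undirected))
```

Running the same tally on color-shuffled copies of the network gives the grey reference bars of a histogram like Fig. 3.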



Motifs as computational building blocks 



Previous work identified the feed-forward motif as significantly over-represented [14,16,18,29] in the
C. elegans brain, as well as in gene regulatory networks [8,9]. We find that while many feed-forward
motifs with colors are also over-represented, many others appear not to be useful. Whether a numerical
over-representation (as measured, for example, by z-scores) is statistically significant must be determined
carefully, by correcting for multiple hypothesis-testing (as pointed out earlier [16,18]), because it is possible
that any individual motif's abundance can appear to be significantly different from the randomization
control purely by chance. We have generalized the step-down min-P procedure [16,18] to colored motifs
(see Methods) to calculate the multiple-hypothesis-corrected P-values for the colored motifs of size 3 and
4. Using 100,000 color randomizations of the C. elegans network, 40 motifs of size 3 (out of a possible
273, see Table 1) have a significant P-value at the 5% level (26 of which have a corrected P < 0.002) and
are shown in Fig. 4, while 505 out of the 8,310 observed motifs of size 4 have a corrected P = 0.055 for
100,000 randomizations (shown in Fig. S2). Because the number of independent hypotheses for motifs
of size 4 is so large (13,770), the corrected P-values depend on the number of randomizations, and can
only become significant if the number of randomizations significantly exceeds the number of hypotheses.
As a consequence, the P-values for these 505 size-4 motifs will dip below the 5% level if the number of
Figure 2. Differential representation of colored motifs. Comparison of the colored motif counts
$N_{CE}$ obtained from the C. elegans neuron network and the average count from 1,000 color-randomized
networks, $N_R$. Points above the zero line represent the colored motifs with higher frequency in the
worm's neuronal network compared to color-randomized networks (over-representation), while those
below that line are suppressed. A: colored directed motifs of size 2, B: colored directed motifs with
three nodes, C: colored directed motifs with 4 nodes. Logarithm is to the base 2.



randomizations is increased even further, and we will treat the set of 505 motifs with corrected P = 0.055 
as our set of significantly over-represented motifs of four neurons. 
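
How the corrected P-values arise from the randomizations can be illustrated with a small sketch of a single-step min-P adjustment, following the definitions given in Methods. It is not the authors' implementation: the function names and the toy counts are hypothetical, the normalization of the per-randomization P-value by S - 1 is an assumption, and so is the final rule (the fraction of randomizations whose smallest P-value is at most the raw P-value of the motif). It is written for clarity rather than speed.

```python
def theta(x):
    """Step function with the value 0.5 at zero."""
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return 0.5


def raw_p_observed(n_ce, n_r):
    """Raw P-value per motif: share of randomizations with a count above the observed one.

    n_ce[i] is the observed count of motif i, n_r[r][i] its count in randomization r.
    """
    s = len(n_r)
    return [sum(theta(n_r[r][i] - n_ce[i]) for r in range(s)) / s
            for i in range(len(n_ce))]


def min_p_per_randomization(n_r):
    """For each randomization, the smallest raw P-value over all motifs."""
    s, m = len(n_r), len(n_r[0])
    p_min = []
    for r in range(s):
        p_vals = [
            sum(theta(n_r[p][i] - n_r[r][i]) for p in range(s) if p != r) / (s - 1)
            for i in range(m)
        ]
        p_min.append(min(p_vals))
    return p_min


def adjusted_p(n_ce, n_r):
    """Single-step min-P adjusted P-values (assumed comparison rule, see lead-in)."""
    raw = raw_p_observed(n_ce, n_r)
    p_min = min_p_per_randomization(n_r)
    s = len(n_r)
    return [sum(1 for q in p_min if q <= p) / s for p in raw]


# Hypothetical toy data: two motifs, five randomizations.
toy_observed = [50, 10]
toy_randomized = [[10, 10], [11, 9], [9, 11], [10, 12], [12, 8]]
print(raw_p_observed(toy_observed, toy_randomized))
print(adjusted_p(toy_observed, toy_randomized))
```

In the toy data the first motif has a raw P-value of zero, yet its adjusted value stays well above zero because, with only five randomizations, some randomization reaches an equally small P-value for some motif by chance. This is the effect described above: with many hypotheses, the adjusted values can only become small when the number of randomizations is large.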
Figure 3. Colored motif abundances. Histogram of abundances of directed structural motifs of two
neurons with particular coloration in C. elegans (black) compared to the average abundance in 1,000
color randomizations of the same network (grey). Green: sensory neuron, red: interneuron, blue: motor
neuron. A: directed pairs (the direction of information flow is left-to-right). B: undirected pairs.



Motifs of size three 

In Fig. 4 we show the 40 significantly over-represented motifs of size three, starting with the
forward-processing motif (a relay chain) from sensor- to inter- to motor-neuron, already shown in Fig. 1. That
this motif is the most notable among all motifs with three neurons confirms that the overall structure
of the chemical synapse network is a three-layer architecture [15]. The second-most over-used motif is
also a relay-chain into the motor neuron, but from another interneuron, suggesting that this is just the 
3-neuron end of a 4-neuron chain that starts with a sensor neuron. And indeed, that chain does appear 
among the significant motifs of size 4 (see below). The "beginning" of that chain also appears among 
the significant 3-neuron motifs. The only other over-represented purely directional chain (using chemical 
synapses only) is the interneuron chain. 

The motif with the third-highest z-value is a feed-forward motif of three interneurons. Feed-forward 
motifs have different uses in computation, depending on whether the feed-forward signal is excitatory or 
inhibitory. Often, these motifs are used to control activation only when input is present [9], or to perform
"perfect adaptation" to constant signals. While feed-forward motifs have previously been identified
as important in C. elegans, it is noteworthy that the most-used type consists of interneurons only, even
though they are in the minority among neuron types. In fact, the list of motifs in Fig. 4 is clearly
dominated by interneurons (69% of the neurons in the list of 40 motifs, compared to only about 29% of
all neurons in the full network of 279). This imbalance suggests that the motifs represent computational 
building blocks that describe the information-processing task: while sensors and motors serve mainly as 
signal sources and sinks, interneurons work as the signal transducers. 

Another highly significant feed-forward motif has the signal originating in a sensor-neuron (see Fig. 4).
Many of these feed-forward motifs come in alternate versions where one of the edges is an undirected gap
junction, but they are never the most common. This is to be expected as there are far fewer undirected
edges (514) than directed edges (2,194). The computational purpose of a feed-forward motif using a
bi-directional gap junction is not immediately obvious, but it is possible that this back connection (or many
back connections) is meaningful in information-processing by providing the opportunity of feedback.
Note also that the "ring" motif where three nodes feed a signal into each other is absent among the 
significantly over-used motifs even though among three-node motifs with three edges, 40% should be 
Figure 4. Significant motifs of size 3. Most over-represented colored motifs of size 3 in the C.
elegans neuronal network, with their z-values. These motifs all have the min-P adjusted P < 0.05. 



uncolored motifs [16,18], some color combinations are significantly over-used, as apparent from Fig. 4.
Their purpose becomes more apparent when considering the four-node motifs. 



Motifs of size four 

The larger the motif, the more specific its computational function. At the same time, the number of 
possible motifs also increases greatly with motif-size. Of the 13,770 possible colored motifs with four 
nodes and directed edges, only 8,310 actually appear in the C. elegans network. We estimate that the 
number of possible colored motifs of size 5 is in the millions, preventing a significance analysis. As in the 
size-3 motifs, interneurons are significantly enriched within the computational motifs (68% of neurons in 
size-4 significant motifs, compared to the baseline abundance of 29% in the network as a whole). 

For four nodes with directed edges, there are 199 possible motifs that are structurally different, but 
many of those topologies are not prominent among the 505 colored motifs that are most significantly
over-represented (shown in Supplementary Fig. S2). Among those, we distinguish five functional classes
of motifs using chemical synapses (directed edges) only, shown in Fig. 5. These classes cover a significant
portion, but not all of the 199 possible structural motifs. (When motifs have undirected edges, they 
sometimes straddle two classes of motifs.) 

The most common motif-class is the nested feed-forward motif (Fig. 5A), of which there are several
kinds. About 40% of the most significantly over-represented motifs fall in this class. They are
distinguished from bi-fan motifs (Fig. 5D) by the number of nodes with highest in-degree but no out-degree:
bi-fans have two output nodes with an in-degree of two (see the example in Fig. 6A) while nested
feed-forwards usually have a single output node with an in-degree of two or three. Only about 5% of the
motifs among our list of 505 are bi-fans according to this definition. The second-most common group of
motifs (about 25%) are feed-forward loops with entry or exit (Fig. 5B), followed by the "integration and
bifurcation" motifs (Fig. 5C, about 20%), and the relatively rare bi-fans, followed finally by the forward
chain (5%, Fig. 5E).
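
The degree-based distinction between bi-fans and nested feed-forward motifs can be written down as a small test. The sketch below is an illustration under simplifying assumptions: motifs contain directed edges (chemical synapses) only, nodes are labeled 0 to 3, and the function names are hypothetical. It applies only the criterion stated above, namely the number of output nodes (no out-degree) with an in-degree of two.

```python
def degrees(edges, n_nodes=4):
    """Return in-degree and out-degree lists for a directed edge list."""
    indeg, outdeg = [0] * n_nodes, [0] * n_nodes
    for a, b in edges:
        outdeg[a] += 1
        indeg[b] += 1
    return indeg, outdeg


def is_bifan_like(edges, n_nodes=4):
    """True if exactly two nodes have in-degree two and no outgoing edge."""
    indeg, outdeg = degrees(edges, n_nodes)
    outputs = [v for v in range(n_nodes) if outdeg[v] == 0 and indeg[v] == 2]
    return len(outputs) == 2


# Bi-fan: inputs 0 and 1 both drive outputs 2 and 3.
bifan = [(0, 2), (0, 3), (1, 2), (1, 3)]
# Bi-fan with coupled inputs (0 also drives 1).
coupled_bifan = bifan + [(0, 1)]
# A nested feed-forward arrangement with the single output node 3.
nested_ff = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

print(is_bifan_like(bifan), is_bifan_like(coupled_bifan), is_bifan_like(nested_ff))
```

Motifs containing gap junctions would need an extra rule, since they can straddle two classes.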

Functional motifs that we do not discuss (about 5% of the motifs in the set of 505) either do not 
show up among the 505 prominent motifs or else are under-represented. An example is the "nested rings" 
motif, an instance of which is shown in Fig. 6B. The relative absence of ring motifs in the network could
imply that feedback via the neuronal connection graph is not extensively used for computation by the 
worm. 

In the following, we discuss the most common colorations of the motifs in each of the classes, their 
possible computational function, and point out some of these motifs in a model of the C. elegans
sub-network used for forward locomotion (Fig. 6D), described in [31].



Nested feed-forward loops 

In this class of motif, one or two inputs are fed forward through one or two relay neurons towards a single
output (see examples in Fig. 5A). Among the top-ten colored motif types by z-value, this motif appears
six times (see Supplementary Fig. S2). We can see several motifs of this class in the reconstruction of
the core network for C. elegans locomotion [31], which models the undulatory behavior of the worm with
a biomechanical model based on the connection structure of the C. elegans neuronal network. We show
nine of the core network nodes and their connections in Fig. 6D, colored according to our convention.
The nodes named "Xv" and "Xd" are representatives of a class of interneurons (SAA) that connect in
the manner shown to the ventral and dorsal head stretch receptors "Sv" and "Sd". Similarly, the motor
neurons labelled "VB" and "VD" are representatives of 18 such neurons [28]. The "AVB" and "PVC"
neurons are representatives of the "master controllers" for forward locomotion [32]. The reconstruction
is noteworthy because it can infer that some of the connections are inhibitory rather than excitatory.
The control of PVC and AVB via the SAA neurons (Xd) in Fig. 6D is a good example of a nested
feed-forward motif, as is the control of DB via the relays PVC and AVB with Xd as the source (but note that
because there are both synaptic (directed) and undirected edges between Xv and AVB, the motifs are
not strictly only feed-forward). Examples of highly-represented motifs of this sort are shown in Fig. 6E,
along with their motif number and z-value as seen in Supplementary Fig. S2. Feed-forward motifs can
be nested in different manners, processing two inputs in parallel, or a single input sequentially or in a
hierarchical manner (see Fig. 5A). All these motifs appear about equally in colorations that have sensor-
or interneurons as the source, and inter- or motorneurons as the signal sink. 

Feed-forward with entry or exit 

A feed-forward loop with a signal connecting to the input, output, or relay-neuron (see sketches in Fig. 5B)
is the second most frequent structure among the most-significant colored motifs of size 4. Most commonly, the
output of the feed-forward loop (the signal-neuron) is directly connected to a motor neuron, highlighting
Figure 5. Five classes of over-represented colored motifs of size 4. A: nested feed-forward
motifs, B: feed-forward motifs with entry and exit, C: integrations and bifurcations, D: bi-fan motif
with or without coupling of the inputs, and E: linear chains.

the use of the feed-forward motif in controlling locomotion (four out of the top five colored motifs in this 
class are of this sort). A feed-forward loop consisting only of interneurons with its output directed into a 
motor neuron is the motif with the two highest z-values among the 505 most-significant colored motifs. 
The third-most common motif in this class has the entry inter-neuron of the feed-forward loop replaced 
with a sensor-neuron. Several of the motifs in this class can readily be seen in the forward locomotion
network (Fig. 6D). There are no "ring" motifs with entry or exit among the 505 significantly over-abundant
motifs of this class. 



Integrations and bifurcations 

This class of motifs (examples are depicted in Fig. 5C) is relatively straightforward as there is no
feed-forward or feed-back of signals. Rather, signals are either distributed via bifurcations or integrated. The
most common motif in this class is a sensor neuron connected to a chain of three interneurons, followed by
the integration of a sensor- and an interneuron which is fed into a motor-neuron, shown in Fig. 6C. The
Figure 6. Colored motifs and network context. A: A common colored bi-fan motif. B: A nested
ring motif that is uncommon in C. elegans. C: A signal integration motif driving a single output. D:
The core of the locomotion network of C. elegans, after [31]. Nodes are colored according to the scheme
used throughout. Arrows with single points are excitatory connections via a chemical synapse, while
edges ending in a bar signal inhibition via a chemical synapse. Edges with two arrow heads denote gap 
junctions. E: A selection of significant four-node motifs that appear in the locomotion network shown in 
D. Below each motif appears the rank and z-score as in Supplementary Figure S2. Note that we 
included motifs that have a synaptic junction (directed link) between Xv or Xd and AVB, because both 
synaptic and gap junction (undirected) edges exist between those neurons. 



most common bifurcation motif is a sensor-neuron feeding an inter-neuron, whose signal is distributed 
over two motor-neurons. Neither of these motifs can be found within the forward locomotion network
(Fig. 6D) or its extensions, so we assume that they are part of another important pathway for C. elegans
behavior. 



Bi-fan motif 



The bi-fan motif (Fig. 5D) is a well-known motif structure (see, e.g., ^) that regulates two independent
outputs using two inputs. The computational function of any bi-fan motif depends on whether the
connections are excitatory or inhibitory [20], and on whether the inputs themselves are connected. While
the motif is used sparingly in C. elegans, some colorations are absolutely essential to the worm's behavior. 
Indeed, the bi-fan motif controlling the motor neurons VB and VD via PVC and AVB (see Fig. 6E,
rightmost motif) happens to be the third-most over-represented motif (by z-value) of all size-4 motifs
(see Fig. 6E as well as Fig. S2). We note, however, that the version where inputs do not communicate
(first type in Fig. 5D) is used much more rarely.

Relay chains 

Relay chains of four nodes such as depicted in Fig. 5E are comparatively rare. The most
over-represented such chains are the forward processing chain from a sensor- into a motorneuron via two
interneurons (which appears in Fig. 6) or from three interneurons into the motor neuron, followed by
variations on the theme with gap junctions replacing the chemical synapses. The relative rarity of the
4-node forward chain underscores how important signal integration and feed-forward processing are for the
worm.



Discussion 

Combining two sets of data that make the neuronal network of C. elegans one of the best understood 
animal control structures known, namely the connection map between neurons and the functional char- 
acterization of each neuron, allows us to gain insight into the computational building blocks of the worm 
brain by determining the over-representation of colored motifs, with respect to a color-randomization of
a network with the connectivity unchanged. We find that while certain structural motifs have previously
been found to be significant with respect to an edge randomization of the network [16,18], many more
colored motifs are highly significant. Indeed, the overall trend is the suppression of nonsensical motifs, 
such as signal chains where muscles feed information into sensors, or inter-neurons deliver signals to 
sensors. The motifs that are used significantly more often than predicted by chance, as determined by 
a multiple-hypothesis-corrected test, are easily identified as important elements in a signal-processing 
network. Sensor-neurons are almost always sources of signals (their out-degree is significantly higher than
their in-degree), while motor neurons are most often the end of the signal chain. Interneurons represent
a much larger fraction of nodes in computational motifs than their overall abundance in the network 
would imply, suggesting that they relay the bulk of the forward-processing information. Notably absent 
among the common motifs are feedback loops, and relay chains longer than three neurons, underscoring 
the need for immediate reaction and the integration of signals. While these observations cannot take 
into account whether the connections between neurons are excitatory or inhibitory (as this data is not 
available for the majority of the connections), the comparison with the core forward-locomotion network 
of C. elegans (where this inference has been made) suggests that an analysis of colored motif utilization 
captures the computational processes underlying that behavior well. 

In the future, we imagine that an analysis of the utilization spectrum of colored motifs can be extended 
to any network where nodes can be assigned tags that differentiate their biological (or social) function, 
but care must be taken to limit the number of functional classes as the predictive power of this approach 
is quickly overwhelmed when too many motifs are possible. 



Methods 

Motif abundances and color randomization 

The wiring diagram, as well as the functional classification of neurons into sensor-, inter- and
motor-neurons, was obtained from [15]. Networks were encoded in terms of an adjacency matrix $A(i,j)$ where
$A(i,j) = 1$ if a chemical synapse connects neuron $i$ to $j$, with $A(j,i) = 0$. Undirected edges (gap
junctions) have $A(i,j) = A(j,i) = 1$. Colored motif counts were obtained using our own implementation of
the FANMOD algorithm [33,34]. We define the z-score of a C. elegans motif as $z = (N_{CE} - N_R)/\sigma$,
where $N_{CE}$ is the abundance for that motif in the C. elegans network, $N_R$ is the average of the
abundance distribution of that motif in color-randomized networks, and $\sigma$ is the standard deviation of that
distribution. Color randomization of the C. elegans colored graph is performed by repeatedly switching 
the colors of two randomly chosen nodes, thus preserving the color distribution and the underlying graph 
topology. The color switch is repeated sufficiently often to guarantee a random color distribution. 
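
As an illustration, the color randomization and the z-score can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the results reported here: the function names are hypothetical, the motif counter is a toy stand-in that counts only single directed edges between two colors (the actual counts come from a FANMOD-style subgraph enumeration), and the population standard deviation is assumed for $\sigma$.

```python
import random
import statistics


def shuffle_colors(colors, n_swaps, rng):
    """Return a color-randomized copy of `colors`.

    Repeatedly swaps the colors of two distinct, randomly chosen nodes,
    so the color distribution and the graph topology are both preserved.
    """
    shuffled = list(colors)
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(shuffled)), 2)
        shuffled[a], shuffled[b] = shuffled[b], shuffled[a]
    return shuffled


def count_colored_edges(adj, colors, src_color, dst_color):
    """Toy stand-in for a colored-motif counter (hypothetical helper).

    Counts directed edges i -> j (adj[i][j] == 1 and adj[j][i] == 0)
    whose endpoints carry the colors src_color -> dst_color.
    """
    n = len(adj)
    return sum(
        1
        for i in range(n)
        for j in range(n)
        if adj[i][j] == 1 and adj[j][i] == 0
        and colors[i] == src_color and colors[j] == dst_color
    )


def z_score(observed, randomized_counts):
    """z = (N_CE - N_R) / sigma, with N_R the mean and sigma the
    (population) standard deviation of the randomized counts."""
    mean = statistics.mean(randomized_counts)
    sigma = statistics.pstdev(randomized_counts)
    if sigma == 0:
        raise ValueError("z-score undefined: randomized counts do not vary")
    return (observed - mean) / sigma


def motif_z_score(adj, colors, src_color, dst_color, n_random, n_swaps, rng):
    """z-score of one colored 'motif' against color-randomized networks."""
    observed = count_colored_edges(adj, colors, src_color, dst_color)
    randomized = [
        count_colored_edges(
            adj, shuffle_colors(colors, n_swaps, rng), src_color, dst_color
        )
        for _ in range(n_random)
    ]
    return z_score(observed, randomized)
```

Because only the node labels are permuted, every randomized network has exactly the same wiring as the original, so any deviation in a motif count reflects the placement of functional classes rather than the topology.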



Multiple hypothesis testing 

We are testing the hypothesis that a colored motif in the C. elegans neuronal network is significantly 
over-represented compared to the same motif in a color-randomized network. Because many hypotheses 
are tested simultaneously, the probability of rejecting the null hypothesis for any motif by chance at least 
once increases with the number of hypotheses tested. To correct for this, we adapted the single-step 
min-P procedure for multiple-hypothesis adjustment [35,36] that was also used by [16,18], as follows. For 
each size class of motifs, let $N_{CE}(i)$ be the count of colored motif $i$ among the $M$ possible colored 
motifs, and $N_R(i,p)$ be the motif count for the same motif $i$ in the $p$th color-randomization of the 
C. elegans network ($p = 1, \ldots, S$), where $S$ is the number of randomizations. Using the Heaviside 
(step) function definition: 

$$\Theta(x) = \begin{cases} 1 & x > 0, \\ 0 & x < 0, \\ 0.5 & x = 0, \end{cases} \tag{1}$$

we define the raw P-value for each C. elegans colored motif i as 

$$P_{CE}(i) = \frac{1}{S} \sum_{r=1}^{S} \Theta\big(N_R(i,r) - N_{CE}(i)\big) . \tag{2}$$

We also define the raw P-value for any randomization r of motif i 

$$P(i,r) = \frac{1}{S} \sum_{\substack{p=1 \\ p \neq r}}^{S} \Theta\big(N_R(i,p) - N_R(i,r)\big) \tag{3}$$

to obtain the most significantly over-represented randomization (by chance) among the colorations i 

$$P_{\min}(r) = \min_{i} P(i,r) . \tag{4}$$

Finally, the single-step min-P adjusted P-value for motif $i$, $\pi(i)$, is obtained by comparing the raw 
P-value of each motif, $P_{CE}(i)$, to the smallest of the P-values found in the randomizations (across 
all motifs) as: 

$$\pi(i) = \Pr\big(P_{CE}(i) \geq P_{\min}(p),\; 1 \leq p \leq S\big) . \tag{5}$$
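
The adjustment can be sketched in Python as follows. This is a minimal sketch of the procedure as we read it, not the code used to produce the results: the function names and the nested-list layout of the counts are hypothetical, and two details are assumptions, namely that the sum in Eq. (3) excludes the term $p = r$ and that both raw P-values are normalized by $S$. The adjusted P-value is computed as the fraction of randomizations whose smallest raw P-value does not exceed the raw P-value of the motif.

```python
def theta(x):
    """Heaviside step function with theta(0) = 0.5, as in Eq. (1)."""
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return 0.5


def min_p_adjusted(n_ce, n_r):
    """Single-step min-P adjusted P-values for over-representation.

    n_ce[i]    -- count of colored motif i in the real network
    n_r[i][r]  -- count of motif i in the r-th color randomization
    Returns a list with one adjusted P-value per motif.
    """
    n_motifs = len(n_ce)
    n_rand = len(n_r[0])

    # Raw P-value of each real motif, Eq. (2).
    p_ce = [
        sum(theta(n_r[i][r] - n_ce[i]) for r in range(n_rand)) / n_rand
        for i in range(n_motifs)
    ]

    # Smallest raw P-value over all motifs for each randomization,
    # Eqs. (3) and (4); the term p == r is skipped (assumption).
    p_min = [
        min(
            sum(
                theta(n_r[i][p] - n_r[i][r])
                for p in range(n_rand)
                if p != r
            ) / n_rand
            for i in range(n_motifs)
        )
        for r in range(n_rand)
    ]

    # Adjusted P-value, Eq. (5): fraction of randomizations whose minimum
    # raw P-value is at most the raw P-value of the real motif.
    return [
        sum(1 for r in range(n_rand) if p_min[r] <= p_ce[i]) / n_rand
        for i in range(n_motifs)
    ]
```

The sketch costs on the order of $M S^2$ comparisons, which is fine for illustration but too slow for 100,000 randomizations; ranking the counts of each motif once would give the same raw P-values far more cheaply.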



Acknowledgments 

We would like to thank Jeffrey Edlund and Paul Sternberg for discussions, and two anonymous referees 
for helpful comments on the manuscript. This work was supported by the National Science Foundation's 
Frontiers in Biological Research Grant FIBR-0527023, and by the NSF's BEACON Center for Evolution 
in Action under contract No. DBI-0939454. 



Colored Motifs in C. elegans 



13 



References 

1. Barabasi AL (2002) Linked: How Everything is Connected to Everything Else. Cambridge, MA: 
Perseus. 

2. Barabasi AL, Oltvai ZN (2004) Network biology: Understanding the cell's functional organization. 
Nat Rev Genet 5: 101-13. 

3. Newman M, Barabasi AL, Watts D (2006) The Structure and Dynamics of Networks. Princeton, 
N.J.: Princeton University Press. 

4. Alon U (2007) An Introduction to Systems Biology: Design Principles of Biological Networks. Boca 
Raton: Chapman and Hall/CRC. 

5. Hartwell LH, Hopfield JJ, Leibler S, Murray AW (1999) From molecular to modular cell biology. 
Nature 402: C47-52. 

6. Callebaut W, Rasskin-Gutman D (2005) Modularity: Understanding the Development and Evolution 
of Complex Systems. Cambridge, MA: MIT Press. 

7. Newman MEJ (2003) The structure and function of complex networks. SIAM Review 45: 167-256. 

8. Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation 
network of Escherichia coli. Nat Genet 31: 64-68. 

9. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, et al. (2002) Network motifs: simple 
building blocks of complex networks. Science 298: 824-7. 

10. Rice JJ, Kershenbaum A, Stolovitzky G (2005) Lasting impressions: Motifs in protein-protein 
maps may provide footprints of evolutionary events. Proc Natl Acad Sci USA 102: 3173-4. 

11. Prill RJ, Iglesias PA, Levchenko A (2005) Dynamic properties of network motifs contribute to 
biological network organization. PLoS Biol 3: e343. 

12. Milo R, Itzkovitz S, Kashtan N, Levitt R, Shen-Orr S, et al. (2004) Superfamilies of evolved and 
designed networks. Science 303: 1538-42. 

13. White J, Southgate E, Thomson J, Brenner S (1986) The structure of the nervous system of the 
nematode Caenorhabditis elegans. Philos Trans R Soc Lond Biol Sci 314: 1-340. 

14. Hall D, Russell R (1991) The posterior nervous system of the nematode Caenorhabditis elegans: 
Serial reconstruction of identified neurons and complete pattern of synaptic interactions. J Neuro- 
science 11: 1-22. 

15. Varshney LR, Chen BL, Paniagua E, Hall DH, Chklovskii DB (2009) Structural properties of 
the Caenorhabditis elegans neuronal network. Preprint arXiv:0907.2373v1 [q-bio.NC], arxiv.org. 

16. Reigl M, Alon U, Chklovskii DB (2004) Search for computational modules in the C. elegans brain. 
BMC Biol 2: 25. 

17. Sporns O, Kotter R (2004) Motifs in brain networks. PLoS Biol 2: e369. 

18. Song S, Sjostrom PJ, Reigl M, Nelson S, Chklovskii DB (2005) Highly nonrandom features of 
synaptic connectivity in local cortical circuits. PLoS Biol 3: e68. 

19. Wuchty S, Oltvai ZN, Barabasi AL (2003) Evolutionary conservation of motif constituents in the 
yeast protein interaction network. Nat Genet 35: 176-9. 






20. Ingram PJ, Stumpf MPH, Stark J (2006) Network motifs: Structure does not determine function. 
BMC Genomics 7: 108. 

21. Lichtman JW, Livet J, Sanes JR (2008) A technicolour approach to the connectome. Nat Rev 
Neurosci 9: 417-22. 

22. White EL (2002) Specificity of cortical synaptic connectivity: Emphasis on perspectives gained 
from quantitative electron microscopy. J Neurocytol 31: 195-202. 

23. Yoshimura Y, Callaway EM (2005) Fine-scale specificity of cortical networks depends on inhibitory 
cell type and connectivity. Nat Neurosci 8: 1552-9. 

24. Lee WP, Jeng BC, Pai TW, Tsai CP, Yu CY, et al. (2006) Differential evolutionary conservation 
of motif modes in the yeast protein interaction network. BMC Genomics 7: 89. 

25. Achacoso TB, Yamamoto WS (1992) AY's Neuroanatomy of C. elegans for computation. Boca 
Raton: CRC Press. 

26. Chase D, Koelle M (2007) Biogenic amine neurotransmitters in C. elegans. In: The 
C. elegans Research Community, WormBook, editor, WormBook. http://www.wormbook.org: 
doi/10.1895/wormbook.1.132.1. 

27. Richmond J (2007) Synaptic function. In: The C. elegans Research Community, WormBook, editor, 
WormBook. http://www.wormbook.org: doi/10.1895/wormbook.1.69.1. 

28. WormAtlas (2002-2010) WormAtlas: A database of behavioral and structural anatomy. 
http://www.wormatlas.org. 

29. White JG (1985) Neural connectivity in Caenorhabditis elegans. Trends Neuroscience 8: 277-283. 

30. Tyson JJ, Chen KC, Novak B (2003) Sniffers, buzzers, toggles and blinkers: Dynamics of regulatory 
and signaling pathways in the cell. Curr Opin Cell Biol 15: 221-31. 

31. Karbowski J, Schindelman G, Cronin CJ, Seah A, Sternberg PW (2008) Systems level circuit model 
of C. elegans undulatory locomotion: Mathematical modeling and molecular genetics. J Comput 
Neurosci 24: 253-276. 

32. Niebur E, Erdos P (1993) Theory of the locomotion of nematodes: Control of the somatic motor 
neurons by interneurons. Mathematical Biosciences 118: 51-82. 

33. Wernicke S (2006) Efficient detection of network motifs. IEEE/ACM Trans Comput Biol Bioinform 
3: 347-59. 

34. Wernicke S, Rasche F (2006) Fanmod: A tool for fast network motif detection. Bioinformatics 22: 
1152-3. 

35. Dudoit S, Shaffer J, Boldrick J (2003) Multiple hypothesis testing in microarray experiments. 
Statistical Science 18: 71-103. 

36. Westfall P, Young SS (1993) Resampling-based Multiple Testing: Examples and Methods for P- 
value Adjustment. New York: Wiley. 






Supplementary Figures 




Figure S1. Histogram of normalized z-scores $z = (N_{CE} - N_R)/\sigma$ of colored motif counts, where 
$\sigma$ is the standard deviation of the count distribution with $n = 1,000$ randomizations. A: motifs 
of size 3, B: motifs of size 4. 






Figure S2. Rank and z-scores (un-normalized) for the 505 motifs of size 4 with single-step min-P 
adjusted P-value P = 0.0556 for 100,000 randomizations. Each of these motifs has $P_{CE} = 0$ (see 
Methods), which implies that each of these motifs was more abundant in the C. elegans network than in 
any of the 100,000 randomized networks. But because there are 5,560 entries in $P_{\min}$ that vanish, 
the adjusted P-value cannot be smaller than 0.0556. Increasing the number of randomizations leads to a 
smaller fraction of zeros in $P_{\min}$, and thus decreases the adjusted P-value of those motifs that 
have $P_{CE} = 0$. (First 56 motifs only; remainder available upon request.) 